5 research outputs found

    Data mining reduction methods and performances of rules

    In data mining, the accuracy of a model is associated with the strength of its rules. However, most machine learning techniques produce a large number of rules, and as a consequence processing time is much longer. This study examines rules with different numbers of attributes in terms of performance, measured as percentage accuracy. The research adopts the Knowledge Discovery in Databases (KDD) methodology for analysis and applies various data mining techniques in the experiments. A dataset of 50 hardware companies, containing 31 attributes and 400 records, was used. In summary, the results show that the Genetic Algorithm produced the highest number of rules, followed by Johnson’s Algorithm and Holte’s 1R. The best classifier for extracting rules in this study is VOT (Voting of Object Tracking). In terms of performance of rules, the best results come from rules with 30 attributes, followed by rules with 1 intersection attribute and lastly rules with 3 intersection attributes. Among the three sets of attributes, the 3 intersection attributes are considered the attributes that can be used as predictor attributes.
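
    Holte’s 1R, one of the rule generators named in this abstract, builds a rule from a single attribute: each attribute value predicts its majority class, and the attribute with the fewest training errors is kept. The following is a minimal sketch of that idea, not the study's implementation; the function name `one_r` and the toy data are invented for illustration.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (attribute_index, rule, errors) for the best single-attribute rule."""
    best = None
    for a in range(len(rows[0])):
        # Count class labels seen for each value of attribute a.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[a]][label] += 1
        # Each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this one-attribute rule.
        errors = sum(rule[row[a]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best

# Toy data (hypothetical): columns are (outlook, windy).
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "no"),
        ("rainy", "yes"), ("overcast", "no")]
labels = ["play", "play", "stay", "stay", "play"]

attr, rule, errors = one_r(rows, labels)
print(attr, rule, errors)
```

    On this toy data the first attribute alone classifies every record, so it is the one 1R keeps.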

    Integrated bio-search approaches with multi-objective algorithms for optimization and classification problem

    Optimal feature selection is crucial yet very difficult to achieve, particularly for classification tasks. Traditional feature selection methods function independently and generate collections of irrelevant features, which degrades classification accuracy. The goal of this paper is to leverage the potential of bio-inspired search algorithms, together with a wrapper, in optimizing multi-objective algorithms, namely ENORA and NSGA-II, to generate an optimal set of features. The main step is to idealize the combination of ENORA and NSGA-II with suitable bio-search algorithms, in which multiple subset generation is implemented. The next step is to validate the optimum feature set by conducting a subset evaluation. Eight (8) comparison datasets of various sizes were deliberately selected for testing. The results show that the ideal combination of the multi-objective algorithms ENORA and NSGA-II with the selected bio-inspired search algorithm is promising for achieving a better optimal solution (i.e. the best features with higher classification accuracy) on the selected datasets. This finding implies that bio-inspired wrapper/filtered system algorithms can boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features.
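
    Multi-objective feature selection trades classification error against the number of selected features. The sketch below shows only the non-dominated (Pareto) filtering step that algorithms such as NSGA-II build on; it is not ENORA or NSGA-II itself, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """a and b are (error_rate, n_features); lower is better on both objectives."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical feature subsets summarised as (error_rate, n_features).
candidates = [(0.10, 12), (0.12, 5), (0.10, 8), (0.20, 5), (0.08, 20)]
front = pareto_front(candidates)
print(front)
```

    Here `(0.10, 12)` is dropped because `(0.10, 8)` reaches the same error with fewer features, and `(0.20, 5)` is dropped because `(0.12, 5)` has lower error at the same size.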

    Optimization of attribute selection model using bio-inspired algorithms

    Attribute selection, also known as feature selection, is an essential process in predictive analysis. To date, various feature selection algorithms have been introduced; nevertheless, they all work independently, which reduces the consistency of the accuracy rate. The aim of this paper is to investigate the use of bio-inspired search algorithms in producing an optimal attribute set. This is achieved in two stages: 1) create attribute selection models by combining search methods and feature selection algorithms, and 2) determine an optimized attribute set by employing bio-inspired algorithms. Classification performance of the produced attribute set is analyzed based on accuracy and the number of selected attributes. Experimental results on six (6) public real datasets reveal that the feature selection model implementing a bio-inspired search algorithm consistently classifies well (i.e. higher accuracy with fewer attributes) on the selected datasets. This finding indicates that bio-inspired algorithms can help identify the few most important features to be used in data mining model construction.
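
    The pairing of a search method with a subset evaluator, judged on accuracy and attribute count, can be sketched as follows. This is an illustration under stated assumptions: an exhaustive search stands in for the bio-inspired search (the abstract does not name a specific algorithm), the evaluator is a leave-one-out 1-nearest-neighbour wrapper chosen for brevity, and the data and function names are hypothetical.

```python
from itertools import combinations

def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the chosen columns."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue
            # Squared Euclidean distance restricted to the selected attributes.
            d = sum((xi[k] - xj[k]) ** 2 for k in subset)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def best_subset(X, y):
    """Exhaustive search (a stand-in for a bio-inspired search) over attribute subsets."""
    n = len(X[0])
    subsets = [s for r in range(1, n + 1) for s in combinations(range(n), r)]
    # Prefer higher accuracy, then fewer attributes.
    return max(subsets, key=lambda s: (loo_accuracy(X, y, s), -len(s)))

# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
X = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 5.1), (1.1, 1.1), (1.2, 9.1)]
y = [0, 0, 0, 1, 1, 1]
chosen = best_subset(X, y)
print(chosen)
```

    Exhaustive search is only feasible for a handful of attributes, which is the practical motivation for replacing it with a heuristic search on realistic datasets.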

    OPTIMIZATION OF ATTRIBUTE SELECTION MODEL USING BIO-INSPIRED ALGORITHMS

    Attribute selection, also known as feature selection, is an essential process in predictive analysis. To date, various feature selection algorithms have been introduced; nevertheless, they all work independently, which reduces the consistency of the accuracy rate. The aim of this paper is to investigate the use of bio-inspired search algorithms in producing an optimal attribute set. This is achieved in two stages: 1) create attribute selection models by combining search methods and feature selection algorithms, and 2) determine an optimized attribute set by employing bio-inspired algorithms. Classification performance of the produced attribute set is analyzed based on accuracy and the number of selected attributes. Experimental results on six (6) public real datasets reveal that the feature selection model implementing a bio-inspired search algorithm consistently classifies well (i.e. higher accuracy with fewer attributes) on the selected datasets. This finding indicates that bio-inspired algorithms can help identify the few most important features to be used in data mining model construction.

    Ideal combination feature selection model for classification problem based on bio-inspired approach

    Feature selection, or attribute reduction, is a crucial process for achieving optimal data reduction in classification tasks. However, most of the feature selection methods introduced so far work individually, which sometimes causes less than optimal features to be selected and subsequently degrades the consistency of the classification accuracy rate. The aim of this paper is to exploit the capability of bio-inspired search algorithms, together with wrapper and filtered methods, in generating an optimal set of features. The important step is to idealize the combined feature selection models by finding the best combination of search method and feature selection algorithms. The next step is to define an optimized feature set for the classification task. Performance is analyzed based on classification accuracy and the number of selected features. Experiments were conducted on nine (9) benchmark datasets of various sizes, categorized as small, medium and large. Experimental results revealed that the ideal combination is a feature selection model implementing a bio-inspired search algorithm, which consistently obtains the optimal solution (i.e. fewer features with higher classification accuracy) on the selected datasets. This finding indicates that exploiting bio-inspired algorithms with an ideal combination of wrapper/filtered methods can help find the optimal features to be used in data mining model construction.